AN S 562A: Linear Mixed Models

An introduction to Linear Mixed Models

Juan Steibel, Austin Putz

ABG-Iowa State University

Learning objectives

  • Understand the basic linear mixed model

    • Basic model equation

    • Random or Fixed?

    • Mixed Model Equations (MME)

    • BLUP and its properties

  • Apply Matrix functions in R/Julia to solve MME

  • Apply R/Julia functions to fit linear mixed effects models

Basic Linear Mixed Model

\[ \boldsymbol{y}=\boldsymbol{X}\boldsymbol{\beta}+\boldsymbol{Z}\boldsymbol{u}+\boldsymbol{e} \] where, \(\boldsymbol{y}\) is the vector of observations
\(\boldsymbol{X}\) is the incidence matrix for fixed effects
\(\boldsymbol{\beta}\) is the vector of fixed effects
\(\boldsymbol{Z}\) is the incidence matrix for random effects
\(\boldsymbol{u} \sim N(\boldsymbol{0},\boldsymbol{G})\) is the vector of random effects
\(\boldsymbol{e} \sim N(\boldsymbol{0},\boldsymbol{R})\) is the vector of random residuals

Let’s take some time to discuss the difference between random and fixed effects

Quick review of properties of Gaussian Distributions

If: \(\boldsymbol{x} \sim N(\boldsymbol{\mu},\boldsymbol{S})\), and \(\boldsymbol{y}=\boldsymbol{K}\boldsymbol{x}\), then: \[ \boldsymbol{y} \sim N(\boldsymbol{K}\boldsymbol{\mu},\boldsymbol{K}\boldsymbol{S}\boldsymbol{K}^{'}) \] Conditional distributions, if: \[ \begin{bmatrix} \boldsymbol{x} \\ \boldsymbol{z} \end{bmatrix} \sim N\left( \begin{bmatrix} \boldsymbol{\mu} \\ \boldsymbol{\nu} \end{bmatrix}, \begin{bmatrix} \boldsymbol{S} & \boldsymbol{C} \\ \boldsymbol{C} ^{'}& \boldsymbol{W} \end{bmatrix} \right) \]

Then: \[ \boldsymbol{x}|\boldsymbol{z}\sim N(\boldsymbol{\mu}+\boldsymbol{C}\boldsymbol{W}^{-1}(\boldsymbol{z}-\boldsymbol{\nu}),\boldsymbol{S}-\boldsymbol{C}\boldsymbol{W}^{-1}\boldsymbol{C}^{'}) \]
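As a quick numeric check of these identities in the scalar case (the course tools are R/Julia, but this dependency-free Python sketch works anywhere; all numeric values are made up for illustration):

```python
# Scalar illustration of the conditional-Gaussian identities:
# x and z jointly normal with means mu, nu, variances S, W, covariance C.
mu, nu = 0.0, 0.0
S, W, C = 1.0, 1.0, 0.5
z = 2.0  # observed value of z

cond_mean = mu + C / W * (z - nu)   # mu + C W^{-1} (z - nu)
cond_var = S - C / W * C            # S - C W^{-1} C'
print(cond_mean, cond_var)  # → 1.0 0.75
```

Observing z = 2 pulls the mean of x toward the observation (proportionally to the covariance) and reduces its variance from 1.0 to 0.75.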

Distributional assumptions

The model assumptions are summarized in the following equation: \[ \begin{bmatrix} \boldsymbol{u} \\ \boldsymbol{e} \end{bmatrix} \sim N\left( \begin{bmatrix} \boldsymbol{0} \\ \boldsymbol{0} \end{bmatrix}, \begin{bmatrix} \boldsymbol{G} & \boldsymbol{0} \\ \boldsymbol{0} & \boldsymbol{R} \end{bmatrix} \right) \]

Which implies: \[ \begin{bmatrix} \boldsymbol{u} \\ \boldsymbol{e} \\ \boldsymbol{y} \end{bmatrix} \sim N\left( \begin{bmatrix} \boldsymbol{0} \\ \boldsymbol{0} \\ \boldsymbol{X}\boldsymbol{\beta} \end{bmatrix}, \begin{bmatrix} \boldsymbol{G} & \boldsymbol{0}& \boldsymbol{GZ}^{'} \\ \boldsymbol{0} & \boldsymbol{R} & \boldsymbol{R} \\ \boldsymbol{ZG} & \boldsymbol{R} & \boldsymbol{ZGZ}^{'}+\boldsymbol{R} \end{bmatrix} \right) \] Let's show a proof of this in class.

Example: birthweights of pigs

The weights of 6 pigs at birth are presented together with their sex and their dam. \[ \boldsymbol{y}=\begin{bmatrix}1.72 \\1.45 \\1.49 \\1.36 \\1.54 \\1.5 \\\end{bmatrix}, \boldsymbol{X}_1= \begin{bmatrix}1&0 \\0&1 \\0&1 \\0&1 \\0&1 \\0&1 \\\end{bmatrix}, \boldsymbol{X}_2= \begin{bmatrix}1&0 \\1&0 \\1&0 \\0&1 \\0&1 \\0&1 \\\end{bmatrix} \] Which of these two should be considered random and why?

Example 1 continued:

\[ \boldsymbol{y}=\begin{bmatrix}1.72 \\1.45 \\1.49 \\1.36 \\1.54 \\1.5 \\\end{bmatrix}, \boldsymbol{X}= \begin{bmatrix}1&0 \\0&1 \\0&1 \\0&1 \\0&1 \\0&1 \\\end{bmatrix}, \boldsymbol{Z}= \begin{bmatrix}1&0 \\1&0 \\1&0 \\0&1 \\0&1 \\0&1 \\\end{bmatrix} \]

If dam is considered random, we need to make an assumption about its variance, and about the residual variance too.

\[ var\left( \begin{bmatrix} u_{474} \\ u_{475} \end{bmatrix} \right)=\boldsymbol{G}=\boldsymbol{I} \times 0.016; var(\boldsymbol{e})=\boldsymbol{R}=\boldsymbol{I} \times 0.08 \] Notice: \(\sigma^{2}_u= 0.016\) and \(\sigma^{2}_e= 0.08\). What other plausible assumptions can we make? (not about the value of the variances, but about the structure of G and R)

How would these assumptions affect the final results?

How about changing \(\sigma^{2}_u\), and \(\sigma^{2}_e\)?

The solution to this problem: BLUE and BLUP

  • Dr. Charles Roy Henderson (a graduate student with Hazel and Lush at ISU) proposed a method to solve this in his PhD thesis in the late 1940s. He later revisited the computation, proposing an easier solution
    \[ \boldsymbol{y}=\boldsymbol{X}\boldsymbol{\beta}+\boldsymbol{Z}\boldsymbol{u}+\boldsymbol{e} \] with: \(\boldsymbol{u} \sim N(\boldsymbol{0},\boldsymbol{G})\) is the vector of random effects
    \(\boldsymbol{e} \sim N(\boldsymbol{0},\boldsymbol{R})\) is the vector of random residuals
    Find:

  • BLUE: Linear estimate of \(\boldsymbol{\beta}\), \(\boldsymbol{\hat{\beta}}\) such that: \(E(\boldsymbol{\hat{\beta}})=\boldsymbol{\beta}\) (i.e., unbiased) and \(Var(\boldsymbol{\hat{\beta}})\) is the minimum among all linear unbiased estimates (best).

  • BLUP: Linear predictor of \(\boldsymbol{u}\), \(\boldsymbol{\hat{u}}\) such that: \(E(\boldsymbol{\hat{u}}-\boldsymbol{u})=\boldsymbol{0}\) (i.e., unbiased) and \(Var(\boldsymbol{\hat{u}}-\boldsymbol{u})\) is the minimum among all linear unbiased predictors (best).

Notice how the definition of unbiased is different for random and fixed effects

The mixed model equations

The mixed model equations are a simple and elegant way of obtaining BLUE and BLUP

\[ \begin{bmatrix} \boldsymbol{X}^{'}\boldsymbol{R}^{-1}\boldsymbol{X} & \boldsymbol{X}^{'}\boldsymbol{R}^{-1}\boldsymbol{Z} \\ \boldsymbol{Z}^{'}\boldsymbol{R}^{-1}\boldsymbol{X} & \boldsymbol{Z}^{'}\boldsymbol{R}^{-1}\boldsymbol{Z}+\boldsymbol{G}^{-1} \end{bmatrix} \begin{bmatrix} \boldsymbol{\hat{\beta}} \\ \boldsymbol{\hat{u}} \end{bmatrix} = \begin{bmatrix} \boldsymbol{X}^{'}\boldsymbol{R}^{-1}\boldsymbol{y} \\ \boldsymbol{Z}^{'}\boldsymbol{R}^{-1}\boldsymbol{y} \end{bmatrix} \]

Let’s discuss this in class.

  • How to solve this system of equations?

  • It’s common that \(\boldsymbol{Z}\) is not full rank. Why can we still solve it?

  • What is the difference between treating an effect as fixed or random in terms of this system of equations?

A simpler model

\[ \boldsymbol{y}=\boldsymbol{X}\boldsymbol{\beta}+\boldsymbol{Z}\boldsymbol{u}+\boldsymbol{e} \] with

\(\boldsymbol{u} \sim N(\boldsymbol{0},\boldsymbol{A} \sigma^2_u)\): the vector of random effects

\(\boldsymbol{e} \sim N(\boldsymbol{0},\boldsymbol{I} \sigma^2_e)\): the vector of random residuals

The MME can be re-written as: \[ \begin{bmatrix} \boldsymbol{X}^{'}\boldsymbol{X} & \boldsymbol{X}^{'}\boldsymbol{Z} \\ \boldsymbol{Z}^{'}\boldsymbol{X} & \boldsymbol{Z}^{'}\boldsymbol{Z}+\boldsymbol{A}^{-1}\lambda \end{bmatrix} \begin{bmatrix} \boldsymbol{\hat{\beta}} \\ \boldsymbol{\hat{u}} \end{bmatrix} = \begin{bmatrix} \boldsymbol{X}^{'}\boldsymbol{y} \\ \boldsymbol{Z}^{'}\boldsymbol{y} \end{bmatrix} \]

with \(\lambda=\sigma^2_e/\sigma^2_u\)
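The λ-form coefficient matrix can be assembled directly from the incidence matrices. Here is a minimal Python sketch for the pig example (the course itself uses R/Julia; the helpers `t` and `mm` are our own, not library calls):

```python
def t(M):     # transpose
    return [list(r) for r in zip(*M)]

def mm(A, B): # matrix product
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

# Incidence matrices from the pig example: sex (fixed) and dam (random)
X = [[1, 0], [0, 1], [0, 1], [0, 1], [0, 1], [0, 1]]
Z = [[1, 0], [1, 0], [1, 0], [0, 1], [0, 1], [0, 1]]
lam = 5.0                   # lambda = sigma2_e / sigma2_u = 0.08 / 0.016
Ainv = [[1, 0], [0, 1]]     # A = I here, so A^{-1} = I

XtX, XtZ = mm(t(X), X), mm(t(X), Z)
ZtX = mm(t(Z), X)
ZtZlam = [[v + lam * a for v, a in zip(r1, r2)] for r1, r2 in zip(mm(t(Z), Z), Ainv)]

# Stack the four blocks into the full coefficient matrix
top = [rx + rz for rx, rz in zip(XtX, XtZ)]
bot = [rx + rz for rx, rz in zip(ZtX, ZtZlam)]
print(top + bot)  # → [[1, 0, 1, 0], [0, 5, 2, 3], [1, 2, 8.0, 0.0], [0, 3, 0.0, 8.0]]
```

The assembled matrix matches the 4×4 system that appears in the worked example below.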

What is the effect of the residual and random effect variance?

\[ \begin{bmatrix} \boldsymbol{X}^{'}\boldsymbol{X} & \boldsymbol{X}^{'}\boldsymbol{Z} \\ \boldsymbol{Z}^{'}\boldsymbol{X} & \boldsymbol{Z}^{'}\boldsymbol{Z}+\boldsymbol{A}^{-1}\lambda \end{bmatrix} \begin{bmatrix} \boldsymbol{\hat{\beta}} \\ \boldsymbol{\hat{u}} \end{bmatrix} = \begin{bmatrix} \boldsymbol{X}^{'}\boldsymbol{y} \\ \boldsymbol{Z}^{'}\boldsymbol{y} \end{bmatrix} \]

\[ \begin{bmatrix} \boldsymbol{\hat{\beta}} \\ \boldsymbol{\hat{u}} \end{bmatrix} = \left(\begin{bmatrix} \boldsymbol{X}^{'}\boldsymbol{X} & \boldsymbol{X}^{'}\boldsymbol{Z} \\ \boldsymbol{Z}^{'}\boldsymbol{X} & \boldsymbol{Z}^{'}\boldsymbol{Z}+\boldsymbol{A}^{-1}\lambda \end{bmatrix} \right)^{-1} \times \begin{bmatrix} \boldsymbol{X}^{'}\boldsymbol{y} \\ \boldsymbol{Z}^{'}\boldsymbol{y} \end{bmatrix} \]

All that matters is \(\lambda\)!

Revisit example

For the example of pig birth weights:

\[ \boldsymbol{X}^{'}\boldsymbol{X}=\begin{bmatrix}1&0 \\0&5 \\\end{bmatrix}, \boldsymbol{X}^{'}\boldsymbol{Z}= \begin{bmatrix}1&0 \\2&3 \\\end{bmatrix}, \boldsymbol{Z}^{'}\boldsymbol{Z}=\begin{bmatrix}3&0 \\0&3 \\\end{bmatrix} ,\boldsymbol{X}^{'}\boldsymbol{y}=\begin{bmatrix}1.72 \\7.34 \\\end{bmatrix} ,\boldsymbol{Z}^{'}\boldsymbol{y}=\begin{bmatrix}4.66 \\4.4 \\\end{bmatrix} \] \[ var\left( \begin{bmatrix} u_{474} \\ u_{475} \end{bmatrix} \right)=\boldsymbol{G}=\boldsymbol{I} \times 0.016; var(\boldsymbol{e})=\boldsymbol{R}=\boldsymbol{I} \times 0.08; \lambda=5.0 \] \[ \begin{bmatrix} \boldsymbol{\hat{\beta}} \\ \boldsymbol{\hat{u}} \end{bmatrix} = \left(\begin{bmatrix} \boldsymbol{X}^{'}\boldsymbol{X} & \boldsymbol{X}^{'}\boldsymbol{Z} \\ \boldsymbol{Z}^{'}\boldsymbol{X} & \boldsymbol{Z}^{'}\boldsymbol{Z}+\boldsymbol{A}^{-1}\lambda \end{bmatrix} \right)^{-1} \times \begin{bmatrix} \boldsymbol{X}^{'}\boldsymbol{y} \\ \boldsymbol{Z}^{'}\boldsymbol{y} \end{bmatrix} \] \[ \begin{bmatrix}1&0&1&0 \\0&5&2&3 \\1&2&8&0 \\0&3&0&8 \\\end{bmatrix}^{-1} \times \begin{bmatrix}1.72 \\7.34 \\4.66 \\4.4 \\\end{bmatrix} = \begin{bmatrix}1.719 \\1.468 \\0.001 \\-0.001 \\\end{bmatrix} \] Let’s study the effect of \(\lambda\) on the BLUE and BLUP
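The 4×4 system above can be checked numerically. This is a dependency-free Python sketch (the course uses R/Julia); `solve` is our own hand-rolled helper, not a library call:

```python
def solve(A, b):
    """Solve A x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(A)
    M = [list(A[i]) + [b[i]] for i in range(n)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))  # pivot row
        M[c], M[p] = M[p], M[c]
        piv = M[c][c]
        M[c] = [v / piv for v in M[c]]
        for r in range(n):
            if r != c:
                f = M[r][c]
                M[r] = [vr - f * vc for vr, vc in zip(M[r], M[c])]
    return [row[n] for row in M]

# MME coefficient matrix and right-hand side for the pig example (lambda = 5)
C = [[1, 0, 1, 0],
     [0, 5, 2, 3],
     [1, 2, 8, 0],
     [0, 3, 0, 8]]
rhs = [1.72, 7.34, 4.66, 4.40]

beta_u = solve(C, rhs)  # order: [sexF, sexM, dam474, dam475]
print([round(v, 4) for v in beta_u])  # → [1.7195, 1.4681, 0.0005, -0.0005]
```

At four decimals the BLUPs are ±0.0005; the slide's ±0.001 is the same solution rounded to three decimals.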

Variance ratio and solutions to MME

Estimates of the sex effects (fixed), dam effects (random), and their differences (dif_sex = sexF − sexM, dif_dam = dam474 − dam475) for different values of the variance ratio:

  lambda   sexF     sexM     dam474    dam475     dif_sex   dif_dam
  1e-04    1.7183   1.4683    0.0017   -0.0017    0.2500    0.0033
  0.05     1.7184   1.4683    0.0016   -0.0016    0.2500    0.0033
  0.1      1.7184   1.4683    0.0016   -0.0016    0.2501    0.0032
  0.3      1.7185   1.4683    0.0015   -0.0015    0.2502    0.0030
  0.5      1.7186   1.4683    0.0014   -0.0014    0.2503    0.0028
  0.75     1.7187   1.4683    0.0013   -0.0013    0.2505    0.0025
  1        1.7188   1.4682    0.0012   -0.0012    0.2506    0.0024
  5        1.7195   1.4681    0.0005   -0.0005    0.2514    0.0011
  10       1.7197   1.4681    0.0003   -0.0003    0.2516    0.0006
  20       1.7198   1.4680    0.0002   -0.0002    0.2518    0.0004

Shrinkage towards the mean (zero)

As the value of \(\lambda\) increases, the predicted random effects are shrunk towards zero. In the extreme case in which the variance ratio is zero, the obtained predictions are the OLS solutions, and when the variance ratio is very large, the predicted values are zero.

  lambda   sexF     sexM     dam474    dam475     dif_sex   dif_dam
  1e-04    1.7183   1.4683    0.0017   -0.0017    0.2500    0.0033
  20       1.7198   1.4680    0.0002   -0.0002    0.2518    0.0004

This is a well known property in the prediction of random effects.
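The shrinkage can be reproduced by re-solving the pig-example MME over a grid of λ values. A dependency-free Python sketch (`solve` is our own Gauss-Jordan helper; the course uses R/Julia):

```python
def solve(A, b):
    """Solve A x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(A)
    M = [list(A[i]) + [b[i]] for i in range(n)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        piv = M[c][c]
        M[c] = [v / piv for v in M[c]]
        for r in range(n):
            if r != c:
                f = M[r][c]
                M[r] = [vr - f * vc for vr, vc in zip(M[r], M[c])]
    return [row[n] for row in M]

# A = I, so the u-block of the MME is Z'Z + lambda*I = (3 + lambda) on the diagonal
rhs = [1.72, 7.34, 4.66, 4.40]
sols = {}
for lam in [1e-4, 1.0, 5.0, 20.0]:
    C = [[1, 0, 1, 0],
         [0, 5, 2, 3],
         [1, 2, 3 + lam, 0],
         [0, 3, 0, 3 + lam]]
    sols[lam] = solve(C, rhs)
    print(lam, [round(v, 4) for v in sols[lam]])
```

The printed rows match the table above: the dam predictions shrink monotonically toward zero as λ grows, while the fixed-effect estimates barely move.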

Generalized least squares equivalence to Mixed model solution

Rewrite the model as: \[ \boldsymbol{y}=\boldsymbol{X}\boldsymbol{\beta}+\boldsymbol{\epsilon} \]

with: \[ \boldsymbol{\epsilon} \sim N(\boldsymbol{0},\boldsymbol{V}) \] and: \[ \boldsymbol{V}=\boldsymbol{ZGZ}^{'}+\boldsymbol{R} \] The GLS solutions are also BLUE of \(\boldsymbol{\beta}\): \[ \boldsymbol{\hat{\beta}}=(\boldsymbol{X}^{'}\boldsymbol{V}^{-1}\boldsymbol{X})^{-1}\boldsymbol{X}^{'}\boldsymbol{V}^{-1}\boldsymbol{y} \]

Application to the example:

\[ \boldsymbol{X}^{'}\boldsymbol{V}^{-1}\boldsymbol{X}=\begin{bmatrix}10.9375&-3.125 \\-3.125&42.1875 \\\end{bmatrix}, (\boldsymbol{X}^{'}\boldsymbol{V}^{-1}\boldsymbol{X})^{-1}=\begin{bmatrix}0.0934&0.0069 \\0.0069&0.0242 \\\end{bmatrix} \]

\[ \boldsymbol{X}^{'}\boldsymbol{V}^{-1}\boldsymbol{y}=\begin{bmatrix}14.22 \\56.56 \\\end{bmatrix}, \boldsymbol{\hat{\beta}}=\begin{bmatrix}1.7195 \\1.4681 \\\end{bmatrix} \]

These results are identical to those obtained in slide 11. This also offers a way, for instance, to perform hypothesis testing for fixed effects when accounting for the co-variance between observations.
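The GLS computation can be reproduced without ever forming \(\boldsymbol{V}^{-1}\) explicitly, by using linear solves instead. A dependency-free Python sketch for the pig example (`solve` is our own helper; the course uses R/Julia):

```python
def solve(A, b):
    """Solve A x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(A)
    M = [list(A[i]) + [b[i]] for i in range(n)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        piv = M[c][c]
        M[c] = [v / piv for v in M[c]]
        for r in range(n):
            if r != c:
                f = M[r][c]
                M[r] = [vr - f * vc for vr, vc in zip(M[r], M[c])]
    return [row[n] for row in M]

# V = Z G Z' + R: two dams with 3 piglets each, sigma2_u = 0.016,
# sigma2_e = 0.08, so each 3x3 dam block is 0.08*I + 0.016*J
V = [[0.08 * (i == j) + 0.016 * (i // 3 == j // 3) for j in range(6)]
     for i in range(6)]
X = [[1, 0], [0, 1], [0, 1], [0, 1], [0, 1], [0, 1]]
y = [1.72, 1.45, 1.49, 1.36, 1.54, 1.50]

# V^{-1} X (column by column) and V^{-1} y via linear solves
ViX = list(zip(*[solve(V, [row[k] for row in X]) for k in range(2)]))
Viy = solve(V, y)

XtViX = [[sum(X[i][a] * ViX[i][b] for i in range(6)) for b in range(2)] for a in range(2)]
XtViy = [sum(X[i][a] * Viy[i] for i in range(6)) for a in range(2)]

beta = solve(XtViX, XtViy)
print([round(b, 4) for b in beta])  # → [1.7195, 1.4681]
```

The intermediate matrix `XtViX` reproduces the \(\boldsymbol{X}^{'}\boldsymbol{V}^{-1}\boldsymbol{X}\) shown above, and `beta` matches the MME fixed-effect solutions.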

Prediction of random effects with GLS

\[ \begin{bmatrix} \boldsymbol{u} \\ \boldsymbol{e} \\ \boldsymbol{y} \end{bmatrix} \sim N\left( \begin{bmatrix} \boldsymbol{0} \\ \boldsymbol{0} \\ \boldsymbol{X}\boldsymbol{\beta} \end{bmatrix}, \begin{bmatrix} \boldsymbol{G} & \boldsymbol{0}& \boldsymbol{GZ}^{'} \\ \boldsymbol{0} & \boldsymbol{R} & \boldsymbol{R} \\ \boldsymbol{ZG} & \boldsymbol{R} & \boldsymbol{V}=\boldsymbol{ZGZ}^{'}+\boldsymbol{R} \end{bmatrix} \right) \] obtain \(E(\boldsymbol{u}|\boldsymbol{y})\): \[ E(\boldsymbol{u}|\boldsymbol{y})=E(\boldsymbol{u})+cov(\boldsymbol{u},\boldsymbol{y})var(\boldsymbol{y})^{-1}(\boldsymbol{y}-E(\boldsymbol{y}))=\boldsymbol{GZ^{'}}\boldsymbol{V}^{-1}(\boldsymbol{y}-\boldsymbol{X\hat{\beta}}) \]

Example: \[ \boldsymbol{Z}^{'}=\begin{bmatrix}1&1&1&0&0&0 \\0&0&0&1&1&1 \\\end{bmatrix}; \boldsymbol{G}=\begin{bmatrix}0.016&0 \\0&0.016 \\\end{bmatrix}; \boldsymbol{y}=\begin{bmatrix}1.72 \\1.45 \\1.49 \\1.36 \\1.54 \\1.5 \\\end{bmatrix}; \boldsymbol{X\beta}=\begin{bmatrix}1.72 \\1.47 \\1.47 \\1.47 \\1.47 \\1.47 \\\end{bmatrix} \]

GLS Example continued

\[ \boldsymbol{V }^{-1}=\begin{bmatrix}10.94&-1.56&-1.56&0&0&0 \\-1.56&10.94&-1.56&0&0&0 \\-1.56&-1.56&10.94&0&0&0 \\0&0&0&10.94&-1.56&-1.56 \\0&0&0&-1.56&10.94&-1.56 \\0&0&0&-1.56&-1.56&10.94 \\\end{bmatrix} \] \[ \boldsymbol{GZ^{'}}\boldsymbol{V}^{-1}(\boldsymbol{y}-\boldsymbol{X\hat{\beta}})= \] \[ \begin{bmatrix}0.12&0.12&0.12&0&0&0 \\0&0&0&0.12&0.12&0.12 \\\end{bmatrix} \times \left( \begin{bmatrix}1.72 \\1.45 \\1.49 \\1.36 \\1.54 \\1.5 \\\end{bmatrix}-\begin{bmatrix}1.72 \\1.47 \\1.47 \\1.47 \\1.47 \\1.47 \\\end{bmatrix} \right) =\begin{bmatrix}5e-04 \\-5e-04 \\\end{bmatrix} \]
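The same BLUP computation in a dependency-free Python sketch (`solve` is our own helper; \(\boldsymbol{X\hat{\beta}}\) is hard-coded from the GLS estimates on the previous slide; the course uses R/Julia):

```python
def solve(A, b):
    """Solve A x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(A)
    M = [list(A[i]) + [b[i]] for i in range(n)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        piv = M[c][c]
        M[c] = [v / piv for v in M[c]]
        for r in range(n):
            if r != c:
                f = M[r][c]
                M[r] = [vr - f * vc for vr, vc in zip(M[r], M[c])]
    return [row[n] for row in M]

# V = Z G Z' + R as before: 3x3 dam blocks of 0.08*I + 0.016*J
V = [[0.08 * (i == j) + 0.016 * (i // 3 == j // 3) for j in range(6)]
     for i in range(6)]
Zt = [[1, 1, 1, 0, 0, 0],
      [0, 0, 0, 1, 1, 1]]
y = [1.72, 1.45, 1.49, 1.36, 1.54, 1.50]
# X beta-hat using the GLS estimates (first record is the female)
Xb = [1.719459, 1.468108, 1.468108, 1.468108, 1.468108, 1.468108]

resid = [yi - xb for yi, xb in zip(y, Xb)]
Vir = solve(V, resid)  # V^{-1} (y - X beta-hat), via a solve instead of inversion
u = [0.016 * sum(z * v for z, v in zip(row, Vir)) for row in Zt]  # G Z' V^{-1} r
print([round(ui, 4) for ui in u])  # → [0.0005, -0.0005]
```

These are the same dam predictions obtained from the mixed model equations, confirming the equivalence.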

More equivalent models

A model of this form: \[ \boldsymbol{y}=\boldsymbol{X}\boldsymbol{\beta}+\boldsymbol{Z}\boldsymbol{u}+\boldsymbol{e} \] where, \(\boldsymbol{u} \sim N(\boldsymbol{0},\boldsymbol{G})\) is the vector of random effects
\(\boldsymbol{e} \sim N(\boldsymbol{0},\boldsymbol{R})\) is the vector of random residuals
Let’s apply the Cholesky decomposition of \(\boldsymbol{G}\) by obtaining a lower triangular matrix \(\boldsymbol{T}\) such that: \[ \boldsymbol{G}=\boldsymbol{T}\boldsymbol{T}^{'} \] Then, replace \(\boldsymbol{u}\) with \(\boldsymbol{u}=\boldsymbol{T}\boldsymbol{u}^{*}\): \[ \boldsymbol{y}=\boldsymbol{X}\boldsymbol{\beta}+\boldsymbol{Z}\boldsymbol{T}\boldsymbol{u}^{*}+\boldsymbol{e}=\boldsymbol{X}\boldsymbol{\beta}+\boldsymbol{Z}^{*}\boldsymbol{u}^{*}+\boldsymbol{e} \] and \(\boldsymbol{u}^{*} \sim N(\boldsymbol{0},\boldsymbol{I})\)

The two models are equivalent in the sense that the observations have the same distribution.
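A small Python sketch of the "Cholesky trick" (the `cholesky` function is our own; the correlated \(\boldsymbol{G}\) values are made up for illustration, not from the pig example; the course uses R/Julia):

```python
def cholesky(G):
    """Lower-triangular T with T T' = G (Cholesky-Banachiewicz algorithm)."""
    n = len(G)
    T = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(T[i][k] * T[j][k] for k in range(j))
            T[i][j] = ((G[i][i] - s) ** 0.5 if i == j
                       else (G[i][j] - s) / T[j][j])
    return T

# Hypothetical correlated G for the two dams (illustrative values only)
G = [[0.016, 0.008],
     [0.008, 0.016]]
T = cholesky(G)

# Check that T T' reconstructs G
TTt = [[sum(T[i][k] * T[j][k] for k in range(2)) for j in range(2)] for i in range(2)]
print(all(abs(TTt[i][j] - G[i][j]) < 1e-12 for i in range(2) for j in range(2)))

# Z* = Z T: the transformed incidence matrix that goes with u* ~ N(0, I)
Z = [[1, 0], [1, 0], [1, 0], [0, 1], [0, 1], [0, 1]]
Zstar = [[sum(z * t for z, t in zip(row, col)) for col in zip(*T)] for row in Z]
```

Fitting `Zstar` with independent unit-variance random effects then induces exactly \(var(\boldsymbol{Z}\boldsymbol{u}) = \boldsymbol{ZGZ}^{'}\) on the observations.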

More equivalent models

The equivalency between this: \[ \boldsymbol{y}=\boldsymbol{X}\boldsymbol{\beta}+\boldsymbol{Z}\boldsymbol{u}+\boldsymbol{e} \] with: \(\boldsymbol{u} \sim N(\boldsymbol{0},\boldsymbol{G})\) and \(\boldsymbol{e} \sim N(\boldsymbol{0},\boldsymbol{R})\)

And this: \[ \boldsymbol{y}=\boldsymbol{X}\boldsymbol{\beta}+\boldsymbol{Z}^{*}\boldsymbol{u}^{*}+\boldsymbol{e} \] with \(\boldsymbol{u}^{*} \sim N(\boldsymbol{0},\boldsymbol{I})\)

This means that if we can fit models with independent random effects, then we can fit any linear mixed model, because we can use the “Cholesky trick” to re-compute the incidence matrix and fit an equivalent model.

Note: This may be computationally inefficient!

Variance of predictors

We saw: \[ \begin{bmatrix} \boldsymbol{\hat{\beta}} \\ \boldsymbol{\hat{u}} \end{bmatrix} = \left(\begin{bmatrix} \boldsymbol{X}^{'}\boldsymbol{R}^{-1}\boldsymbol{X} & \boldsymbol{X}^{'}\boldsymbol{R}^{-1}\boldsymbol{Z} \\ \boldsymbol{Z}^{'}\boldsymbol{R}^{-1}\boldsymbol{X} & \boldsymbol{Z}^{'}\boldsymbol{R}^{-1}\boldsymbol{Z}+\boldsymbol{G}^{-1} \end{bmatrix} \right)^{-1} \times \begin{bmatrix} \boldsymbol{X}^{'}\boldsymbol{R}^{-1}\boldsymbol{y} \\ \boldsymbol{Z}^{'}\boldsymbol{R}^{-1}\boldsymbol{y} \end{bmatrix} = \begin{bmatrix} \boldsymbol{C}_{XX} & \boldsymbol{C}_{XZ} \\ \boldsymbol{C}_{ZX} & \boldsymbol{C}_{ZZ} \end{bmatrix} \begin{bmatrix} \boldsymbol{X}^{'}\boldsymbol{R}^{-1}\boldsymbol{y} \\ \boldsymbol{Z}^{'}\boldsymbol{R}^{-1}\boldsymbol{y} \end{bmatrix} \] The elements \(\boldsymbol{C}_{XX}\), \(\boldsymbol{C}_{XZ}\), \(\boldsymbol{C}_{ZZ}\) are key matrices in LMM as they are used to derive variances of predictors:

\[ Var(\boldsymbol{\hat{\beta}})=\boldsymbol{C}_{XX} \] and \[ PEV=var(\boldsymbol{\hat{u}}-\boldsymbol{u})=\boldsymbol{C}_{ZZ} \] and

\[ var(\boldsymbol{\hat{u}})=\boldsymbol{G}-\boldsymbol{C}_{ZZ} \] Now, let's discuss asymptotic values in class.
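For the pig example, \(\boldsymbol{C}_{ZZ}\) can be obtained by inverting the λ-form coefficient matrix and scaling by \(\sigma^2_e\) (since \(\boldsymbol{R}=\boldsymbol{I}\sigma^2_e\)). A dependency-free Python sketch (`solve` is our own helper; the course uses R/Julia):

```python
def solve(A, b):
    """Solve A x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(A)
    M = [list(A[i]) + [b[i]] for i in range(n)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        piv = M[c][c]
        M[c] = [v / piv for v in M[c]]
        for r in range(n):
            if r != c:
                f = M[r][c]
                M[r] = [vr - f * vc for vr, vc in zip(M[r], M[c])]
    return [row[n] for row in M]

# lambda-form MME coefficient matrix for the pig example (lambda = 5)
C = [[1, 0, 1, 0],
     [0, 5, 2, 3],
     [1, 2, 8, 0],
     [0, 3, 0, 8]]
n, s2e, s2u = 4, 0.08, 0.016

# Invert by solving against identity columns; scaling by sigma2_e recovers
# the C_XX, C_XZ, C_ZZ blocks of the R^{-1}-scaled MME inverse
cols = [solve(C, [float(i == k) for i in range(n)]) for k in range(n)]
Cinv = [[s2e * cols[k][i] for k in range(n)] for i in range(n)]

C_ZZ = [row[2:] for row in Cinv[2:]]              # PEV = var(u-hat - u)
var_u_hat = [s2u - C_ZZ[i][i] for i in range(2)]  # diagonal of var(u-hat)
print(round(C_ZZ[0][0], 4), round(var_u_hat[0], 4))  # → 0.0134 0.0026
```

With only three records per dam and λ = 5, the PEV (0.0134) stays close to \(\sigma^2_u = 0.016\): the data are barely informative about the dam effects, so \(var(\boldsymbol{\hat{u}})\) is small, consistent with the strong shrinkage seen earlier.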

This Course at a glance

Now we can describe the content of this course in more detail, starting from a general version of the mixed model shown before and specializing it to a few particular cases:

\[ \boldsymbol{y}=\boldsymbol{X}\boldsymbol{\beta}+\sum_{k=1}^{K}{\boldsymbol{Z}_k\boldsymbol{u}_k}+\boldsymbol{e} \] with: \(\boldsymbol{u}_k \sim N(\boldsymbol{0},\boldsymbol{G}_k)\).

The Sire Model

\[ \boldsymbol{y}=\boldsymbol{X}\boldsymbol{\beta}+\boldsymbol{Z}\boldsymbol{u}+\boldsymbol{e} \] with \(\boldsymbol{u} \sim N(\boldsymbol{0},\boldsymbol{A} \sigma^2_u)\) and \(\boldsymbol{e} \sim N(\boldsymbol{0},\boldsymbol{I} \sigma^2_e)\).

where: \(\boldsymbol{u}\) represents the vector of sire effects and \(\boldsymbol{Z}\) is a matrix linking the phenotype of each animal (\(\boldsymbol{y}\)) to its corresponding random sire effect.

Also, \(\boldsymbol{A}\) represents a relationship matrix between sires and \(\sigma^2_u\) is the sire variance.

The Animal Model

\[ \boldsymbol{y}=\boldsymbol{X}\boldsymbol{\beta}+\boldsymbol{Z}\boldsymbol{u}+\boldsymbol{e} \] with \(\boldsymbol{u} \sim N(\boldsymbol{0},\boldsymbol{A} \sigma^2_u)\) and \(\boldsymbol{e} \sim N(\boldsymbol{0},\boldsymbol{I} \sigma^2_e)\).

where: \(\boldsymbol{u}\) represents the vector of animal effects and \(\boldsymbol{Z}\) is a matrix linking the phenotype of each animal (\(\boldsymbol{y}\)) to its corresponding “breeding value”.

Also, \(\boldsymbol{A}\) represents a relationship matrix and \(\sigma^2_u\) is the genetic additive variance.

Random Regression models

\[ \boldsymbol{y}=\boldsymbol{X}\boldsymbol{\beta}+\boldsymbol{Z}\boldsymbol{u}+\boldsymbol{e} \] with \(\boldsymbol{u} \sim N(\boldsymbol{0},\boldsymbol{G})\) and \(\boldsymbol{e} \sim N(\boldsymbol{0},\boldsymbol{I} \sigma^2_e)\).

where: \(\boldsymbol{u}\) represents a vector of random individual regression coefficients and \(\boldsymbol{Z}\) is a matrix linking the phenotype of each animal at a specific time point to that animal's regression coefficients.

Also, \(\boldsymbol{G}\) will have a special structure to account for covariances between animals and between coefficients within animals

Indirect genetic effects models

\[ \boldsymbol{y}=\boldsymbol{X}\boldsymbol{\beta}+\sum_{k=1}^{K}{\boldsymbol{Z}_k\boldsymbol{u}_k}+\boldsymbol{e} \] Sometimes, as in maternal effects models and social genetic effects models, some random effects will have zeroes in the diagonal of the corresponding \(\boldsymbol{Z}_k\) because the corresponding random effect does not affect the individual’s own record, but the record of an animal that interacts with it, e.g., its mother.

These indirect genetic effect models are used in many genetic evaluations.

Unknown variance components

In the first half of the semester we assume that variance components are known; in the last part of the semester we will learn methods to estimate them.

But here is the catch with unknown variance components, when they are estimated from the same data from which the BLUPs of \(\boldsymbol{u}\) are obtained:

  • Best: It’s not guaranteed to be the lowest-variance solution

  • Linear: They are no longer linear functions of the data, because the estimated variances are themselves functions of the data

  • Unbiased: The most commonly used method to estimate variance components gives biased results, so…

  • Predictor: It’s still A predictor 8^D

In fact, even with estimated variance components, and with reasonably large datasets relative to model size, the departures from these properties will likely be minimal.